Goto

Collaborating Authors

 Montgomery County


GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers

Hwang, Hochul, Yang, Soowan, Monon, Jahir Sadik, Giudice, Nicholas A, Lee, Sunghoon Ivan, Biswas, Joydeep, Kim, Donghyun

arXiv.org Artificial Intelligence

While commendable progress has been made in user-centric research on mobile assistive systems for blind and low-vision (BLV) individuals, references that directly inform robot navigation design remain rare. To bridge this gap, we conducted a comprehensive human study involving interviews with 26 guide dog handlers, four white cane users, nine guide dog trainers, and one O\&M trainer, along with 15+ hours of observing guide dog-assisted walking. After de-identification, we open-sourced the dataset to promote human-centered development and informed decision-making for assistive systems for BLV people. Building on insights from this formative study, we developed GuideNav, a vision-only, teach-and-repeat navigation system. Inspired by how guide dogs are trained and assist their handlers, GuideNav autonomously repeats a path demonstrated by a sighted person using a robot. Specifically, the system constructs a topological representation of the taught route, integrates visual place recognition with temporal filtering, and employs a relative pose estimator to compute navigation actions - all without relying on costly, heavy, power-hungry sensors such as LiDAR. In field tests, GuideNav consistently achieved kilometer-scale route following across five outdoor environments, maintaining reliability despite noticeable scene variations between teach and repeat runs. A user study with 3 guide dog handlers and 1 guide dog trainer further confirmed the system's feasibility, marking (to our knowledge) the first demonstration of a quadruped mobile system retrieving a path in a manner comparable to guide dogs.


Trace Regularity PINNs: Enforcing $\mathrm{H}^{\frac{1}{2}}(\partial Ω)$ for Boundary Data

Kim, Doyoon, Song, Junbin

arXiv.org Artificial Intelligence

We propose an enhanced physics-informed neural network (PINN), the Trace Regularity Physics-Informed Neural Network (TRPINN), which enforces the boundary loss in the Sobolev-Slobodeckij norm $H^{1/2}(\partial Ω)$, the correct trace space associated with $H^1(Ω)$. We reduce computational cost by computing only the theoretically essential portion of the semi-norm and enhance convergence stability by avoiding denominator evaluations in the discretization. By incorporating the exact $H^{1/2}(\partial Ω)$ norm, we show that the approximation converges to the true solution in the $H^{1}(Ω)$ sense, and, through Neural Tangent Kernel (NTK) analysis, we demonstrate that TRPINN can converge faster than standard PINNs. Numerical experiments on the Laplace equation with highly oscillatory Dirichlet boundary conditions exhibit cases where TRPINN succeeds even when standard PINNs fail, and show performance improvements of one to three decimal digits.


Infinite-Dimensional Operator/Block Kaczmarz Algorithms: Regret Bounds and $λ$-Effectiveness

Jeong, Halyun, Jorgensen, Palle E. T., Kwon, Hyun-Kyoung, Song, Myung-Sin

arXiv.org Machine Learning

We present a variety of projection-based linear regression algorithms with a focus on modern machine-learning models and their algorithmic performance. We study the role of the relaxation parameter in generalized Kaczmarz algorithms and establish a priori regret bounds with explicit $λ$-dependence to quantify how much an algorithm's performance deviates from its optimal performance. A detailed analysis of relaxation parameter is also provided. Applications include: explicit regret bounds for the framework of Kaczmarz algorithm models, non-orthogonal Fourier expansions, and the use of regret estimates in modern machine learning models, including for noisy data, i.e., regret bounds for the noisy Kaczmarz algorithms. Motivated by machine-learning practice, our wider framework treats bounded operators (on infinite-dimensional Hilbert spaces), with updates realized as (block) Kaczmarz algorithms, leading to new and versatile results.


The best approximation pair problem relative to two subsets in a normed space

Reem, Daniel, Censor, Yair

arXiv.org Artificial Intelligence

In the classical best approximation pair (BAP) problem, one is given two nonempty, closed, convex and disjoint subsets in a finite- or an infinite-dimensional Hilbert space, and the goal is to find a pair of points, each from each subset, which realizes the distance between the subsets. We discuss the problem in more general normed spaces and with possibly non-convex subsets, and focus our attention on the issues of uniqueness and existence of the solution to the problem. As far as we know, these fundamental issues have not received much attention. We present several sufficient geometric conditions for the (at most) uniqueness of a BAP. These conditions are related to the structure and the relative orientation of the boundaries of the subsets and to the norm. We also present many sufficient conditions for the existence of a BAP. Our results significantly extend the horizon of a recent algorithm for solving the BAP problem [Censor, Mansour, Reem, J. Approx. Theory (2024)]. The paper also shows, perhaps for the first time, how wide is the scope of the BAP problem in terms of the scientific communities which are involved in it (frequently independently) and in terms of its applications.



Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets

Stöckermann, Patrick, Südfeld, Henning, Immordino, Alessandro, Altenmüller, Thomas, Wegmann, Marc, Gebser, Martin, Schekotihin, Konstantin, Seidel, Georg, Chan, Chew Wye, Zhang, Fei Fei

arXiv.org Artificial Intelligence

Benchmark datasets are crucial for evaluating approaches to scheduling or dispatching in the semiconductor industry during the development and deployment phases. However, commonly used benchmark datasets like the Minifab or SMT2020 lack the complex details and constraints found in real-world scenarios. To mitigate this shortcoming, we compare open-source simulation models with a real industry dataset to evaluate how optimization methods scale with different levels of complexity. Specifically, we focus on Reinforcement Learning methods, performing optimization based on policy-gradient and Evolution Strategies. Our research provides insights into the effectiveness of these optimization methods and their applicability to realistic semiconductor frontend fab simulations. We show that our proposed Evolution Strategies-based method scales much better than a comparable policy-gradient-based approach. Moreover, we identify the selection and combination of relevant bottleneck tools to control by the agent as crucial for an efficient optimization. For the generalization across different loading scenarios and stochastic tool failure patterns, we achieve advantages when utilizing a diverse training dataset. While the overall approach is computationally expensive, it manages to scale well with the number of CPU cores used for training. For the real industry dataset, we achieve an improvement of up to 4% regarding tardiness and up to 1% regarding throughput. For the less complex open-source models Minifab and SMT2020, we observe double-digit percentage improvement in tardiness and single digit percentage improvement in throughput by use of Evolution Strategies.


Conceptual Logical Foundations of Artificial Social Intelligence

Werner, Eric

arXiv.org Artificial Intelligence

What makes a society possible at all? How is coordination and cooperation in social activity possible? What is the minimal mental architecture of a social agent? How is the information about the state of the world related to the agents intentions? How are the intentions of agents related? What role does communication play in this coordination process? This essay explores the conceptual and logical foundations of artificial social intelligence in the context of a society of multiple agents that communicate and cooperate to achieve some end. An attempt is made to provide an introduction to some of the key concepts, their formal definitions and their interrelationships. These include the notion of a changing social world of multiple agents. The logic of social intelligence goes beyond classical logic by linking information with strategic thought. A minimal architecture of social agents is presented. The agents have different dynamically changing, possible choices and abilities. The agents also have uncertainty, lacking perfect information about their physical state as well as their dynamic social state. The social state of an agent includes the intentional state of that agent, as well as, that agent's representation of the intentional states of other agents. Furthermore, it includes the evaluations agents make of their physical and social condition. Communication, semantic and pragmatic meaning and their relationship to intention and information states are investigated. The logic of agent abilities and intentions are motivated and formalized. The entropy of group strategic states is defined.


The common ground of DAE approaches. An overview of diverse DAE frameworks emphasizing their commonalities

Schwarz, Diana Estévez, Lamour, René, März, Roswitha

arXiv.org Artificial Intelligence

We analyze different approaches to differential-algebraic equations with attention to the implemented rank conditions of various matrix functions. These conditions are apparently very different and certain rank drops in some matrix functions actually indicate a critical solution behavior. We look for common ground by considering various index and regularity notions from literature generalizing the Kronecker index of regular matrix pencils. In detail, starting from the most transparent reduction framework, we work out a comprehensive regularity concept with canonical characteristic values applicable across all frameworks and prove the equivalence of thirteen distinct definitions of regularity. This makes it possible to use the findings of all these concepts together. Additionally, we show why not only the index but also these canonical characteristic values are crucial to describe the properties of the DAE.



BuildingView: Constructing Urban Building Exteriors Databases with Street View Imagery and Multimodal Large Language Mode

Li, Zongrong, Su, Yunlei, Zhu, Chenyuan, Zhao, Wufan

arXiv.org Artificial Intelligence

Urban Building Exteriors are increasingly important in urban analytics, driven by advancements in Street View Imagery and its integration with urban research. Multimodal Large Language Models (LLMs) offer powerful tools for urban annotation, enabling deeper insights into urban environments. However, challenges remain in creating accurate and detailed urban building exterior databases, identifying critical indicators for energy efficiency, environmental sustainability, and human-centric design, and systematically organizing these indicators. To address these challenges, we propose BuildingView, a novel approach that integrates high-resolution visual data from Google Street View with spatial information from OpenStreetMap via the Overpass API. This research improves the accuracy of urban building exterior data, identifies key sustainability and design indicators, and develops a framework for their extraction and categorization. Our methodology includes a systematic literature review, building and Street View sampling, and annotation using the ChatGPT-4O API. The resulting database, validated with data from New York City, Amsterdam, and Singapore, provides a comprehensive tool for urban studies, supporting informed decision-making in urban planning, architectural design, and environmental policy. The code for BuildingView is available at https://github.com/Jasper0122/BuildingView.